NSF PAR Search | NSF Public Access Repository


Can MLLMs Perform Text-to-Image In-Context Learning?

Zeng, Yuchen; Kang, Wonjun; Chen, Yicong; Koo, Hyung Il; Lee, Kangwook (October 2024, CONFERENCE ON LANGUAGE MODELING 2024)

The evolution from Large Language Models (LLMs) to Multimodal Large Language Models (MLLMs) has spurred research into extending In-Context Learning (ICL) to its multimodal counterpart. Existing studies have concentrated primarily on image-to-text ICL, whereas Text-to-Image ICL (T2I-ICL), with its unique characteristics and potential applications, remains underexplored. To address this gap, we formally define the task of T2I-ICL and present CoBSAT, the first T2I-ICL benchmark dataset, encompassing ten tasks. Benchmarking six state-of-the-art MLLMs on our dataset, we find that they have considerable difficulty solving T2I-ICL. We identify the primary challenges as the inherent complexity of multimodality and image generation, and show that strategies such as fine-tuning and Chain-of-Thought prompting help mitigate these difficulties, leading to notable improvements in performance. Our code and dataset are available at https://github.com/UW-Madison-Lee-Lab/CoBSAT.
Full Text Available
Equal Improvability: A New Fairness Notion Considering the Long-term Impact

Guldogan, Ozgur; Zeng, Yuchen; Sohn, Jy-yong; Pedarsani, Ramtin; Lee, Kangwook (February 2023, International Conference on Learning Representations (ICLR))

Full Text Available
Unsupervised Domain Alignment Based Open Set Structural Recognition of Macromolecules Captured By Cryo-Electron Tomography

https://doi.org/10.1109/ICIP42928.2021.9506205

Zeng, Yuchen; Howe, Gregory; Yi, Kai; Zeng, Xiangrui; Zhang, Jing; Chang, Yi-Wei; Xu, Min (September 2021, 2021 IEEE International Conference on Image Processing (ICIP))

Full Text Available
Multiway clustering via tensor block models

Wang, Miaoyan; Zeng, Yuchen (December 2019, Advances in Neural Information Processing Systems 32 (NeurIPS))

We consider the problem of identifying multiway block structure from a large noisy tensor. Such problems arise frequently in applications such as genomics, recommendation systems, topic modeling, and sensor network localization. We propose a tensor block model, develop a unified least-squares estimation procedure, and obtain theoretical accuracy guarantees for multiway clustering. The statistical convergence of the estimator is established, and we show that the associated clustering procedure achieves partition consistency. A sparse regularization is further developed for identifying important blocks with elevated means. The proposal handles a broad range of data types, including binary, continuous, and hybrid observations. Through simulation and application to two real datasets, we demonstrate that our approach outperforms previous methods.
Full Text Available
